8 research outputs found

D6.2 Integrated Final Version of the Components for Lexical Acquisition

Author: Bel Núria
Frontini Francesca
Monachini Monica
Padró Muntsa
Quochi Valeria
Rimell Laura

The PANACEA project has addressed one of the most critical bottlenecks that threaten the development of technologies to support multilingualism in Europe and to process the huge quantity of multilingual data produced annually. Any attempt at automated language processing, particularly Machine Translation (MT), depends on the availability of language-specific resources. Such Language Resources (LRs) contain information about the language's lexicon, i.e. the words of the language and the characteristics of their use. In Natural Language Processing (NLP), LRs contribute information about the syntactic and semantic behaviour of words, i.e. their grammar and their meaning, which informs downstream applications such as MT. To date, many LRs have been generated by hand, requiring significant manual labour from linguistic experts. However, proceeding manually, it is impossible to supply LRs for every possible pair of European languages, textual domain, and genre, all of which are needed by MT developers. Moreover, an LR for a given language can never be considered complete or final, because natural language continually undergoes change, spurred on especially by the emergence of new knowledge domains and new technologies. PANACEA has addressed this challenge by building a factory of LRs that progressively automates the stages involved in the acquisition, production, updating and maintenance of the LRs required by MT systems. The existence of such a factory will significantly cut down the cost, time and human effort required to build LRs. WP6 has addressed the lexical acquisition component of the LR factory, that is, the techniques for automated extraction of key lexical information from texts, and the automatic collation of lexical information into LRs in a standardized format.
The goal of WP6 has been to take existing techniques capable of acquiring syntactic and semantic information from corpus data, improve upon them, adapt and apply them to multiple languages, and turn them into powerful and flexible techniques capable of supporting massive applications. One focus for improving the scalability and portability of lexical acquisition techniques has been to extend existing techniques with more powerful, less "supervised" methods. In NLP, the amount of supervision refers to the amount of manual annotation which must be applied to a text corpus before machine learning or other techniques are applied to the data to compile a lexicon. More manual annotation means more accurate training data, and thus a more accurate LR. However, given that it is impractical from a cost and time perspective to manually annotate the vast amounts of data required for multilingual MT across domains, it is important to develop techniques which can learn from corpora with less supervision. Less supervised methods are capable of supporting both large-scale acquisition and efficient domain adaptation, even in domains where data is scarce. Another focus of lexical acquisition in PANACEA has been the need of LR users to tune the accuracy level of LRs. Some applications may require increased precision, or accuracy, because they need a high degree of confidence in the lexical information used. Others may require greater coverage, with information about more words at the expense of some degree of accuracy. Lexical acquisition in PANACEA has investigated confidence thresholds to ensure that the ultimate users of LRs can generate lexical data from the PANACEA factory at the desired level of accuracy.

PUblication MAnagement

D6.5 Merged dictionaries

Author: Bel Núria
Del Gratta Riccardo
Frontini Francesca
Monachini Monica
Padró Muntsa
Quochi Valeria
Rimell Laura

This document presents the merged dictionaries delivered in PANACEA. These dictionaries result from merging existing lexica, generally for the general domain, with domain-specific lexica acquired using the PANACEA platform. The domain-specific lexica are presented and delivered in D6.3, and the merging repository that allowed the multilevel merging is presented in D6.4.

PUblication MAnagement

The European Language Resources and Technologies Forum: Shaping the Future of the Multilingual Digital Europe

Author: Baroni Paola
Bel Núria
Budin Gerhard
Calzolari Nicoletta
Choukri Khalid
Goggi Sara
Mariani Joseph
Monachini Monica
Odijk Jan
Piperidis Stelios
Quochi Valeria
Soria Claudia
Toral Antonio
Publication venue: Istituto di Linguistica Computazionale del CNR - Pisa, ITALY

Proceedings of the 1st FLaReNet Forum on the European Language Resources and Technologies, held in Vienna, at the Austrian Academy of Sciences, on 12-13 February 2009.

PUblication MAnagement

ECP-2007-LANG-617001 FLaReNet: Action Plan

Author: Baroni Paola
Bel Núria
Budin Gerhard
Calzolari Nicoletta
Caselli Tommaso
Choukri Khalid
Goggi Sara
Mariani Joseph
Monachini Monica
Odijk Jan
Piperidis Stelios
Quochi Valeria
Soria Claudia
Toral Antonio

Action plan of the FLaReNet project

PUblication MAnagement

An Expanded 2D Fused Aromatic Network with 90-Ring Hexagons

Author: Khlobystov Andrei N.
Lerma-Berlanga Belén
Liu Meng
Martí-Gastaldo Carlos
Mateo-Alonso Aurelio
Melle-Franco Manuel
Paolucci Francesco
Riaño Alberto
Saeki Akinori
Stoppiello Craig T.
Strutyński Karol
Valenti Giovanni
Publication venue: Wiley
Publication date: 08/11/2021

Two-dimensional fused aromatic networks (2D FANs) have emerged as a highly versatile alternative to holey graphene. The synthesis of 2D FANs with increasingly larger lattice dimensions will enable new application perspectives. However, the synthesis of larger analogues is mostly limited by the lack of appropriate monomers and methods. Herein, we describe the synthesis, characterisation and properties of an expanded 2D FAN with 90-ring hexagons, which exceeds the largest 2D FAN lattices reported to date.

Repository@Nottingham

ZENODO

Archivo Digital para la Docencia y la Investigación

PubMed Central

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

FLaReNet: una red para fomentar los recursos lingüísticos (Fostering Language Resources Network: FLaReNet)

Author: Bel Núria
Calzolari Nicoletta
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN)

FLaReNet is a thematic network whose objective is to prepare strategies and recommendations for the promotion and development of language technologies and their associated language resources, given their importance for minimizing the impact of linguistic diversity in a digital and multilingual Europe. The results of this joint process of reflection by researchers and professionals from all around the world will be the basis of agreed European policies for funding and promoting this sector.

PUblication MAnagement

ECP-2007-LANG-617001 FLaReNet: Progress Report No. 6

Author: Baroni Paola
Bel Núria
Calzolari Nicoletta
Choukri Khalid
Goggi Sara
Mariani Joseph
Odijk Jan
Piperidis Stelios
Soria Claudia

Sixth semestrial report on the progress of the FLaReNet project

PUblication MAnagement

ECP-2007-LANG-617001 FLaReNet: Progress Report No. 4

Author: Baroni Paola
Bel Núria
Budin Gerhard
Calzolari Nicoletta
Choukri Khalid
Mariani Joseph
Odijk Jan
Piperidis Stelios
Quochi Valeria
Soria Claudia

Fourth semestrial report on the progress of the FLaReNet project

PUblication MAnagement